Abstract. In a standard Bayesian approach to the alpha-factor model for common-cause
failure, a precise Dirichlet prior distribution models epistemic uncertainty in the alpha-factors. 
This Dirichlet prior is then updated with observed data to obtain a posterior distribution, 
which forms the basis for further inferences. 

In this paper, we adapt the imprecise Dirichlet model of Walley to represent epistemic 
uncertainty in the alpha-factors. In this approach, epistemic uncertainty is expressed more 
cautiously via lower and upper expectations for each alpha-factor, along with a learning parameter
which determines how quickly the model learns from observed data. For this application,
we focus on elicitation of the learning parameter, and find that values in the range of 1 to 10 
seem reasonable. The approach is compared with Kelly and Atwood's minimally informative 
Dirichlet prior for the alpha-factor model, which incorporated precise mean values for the 
alpha-factors, but which was otherwise quite diffuse. 

Next, we explore the use of a set of Gamma priors to model epistemic uncertainty in the 
marginal failure rate, expressed via a lower and upper expectation for this rate, again along 
with a learning parameter. As zero counts are generally less of an issue here, we find that the 
choice of this learning parameter is less crucial. 

Finally, we demonstrate how both epistemic uncertainty models can be combined to arrive 
at lower and upper expectations for all common-cause failure rates. Thereby, we effectively 
provide a full sensitivity analysis of common-cause failure rates, properly reflecting epistemic 
uncertainty of the analyst on all levels of the common-cause failure model. 



1. Introduction 

Common-cause failure has been recognized since the time of the Reactor Safety Study [5] as a 
dominant contributor to the unreliability of redundant systems. A number of models have been 
developed for common-cause failure over the time since the publication of the Reactor Safety 
Study, with perhaps the most widely used one being the Basic Parameter Model, at least in the 
U.S. 0. 

The alpha-factor parametrisation of this model uses a multinomial distribution as its aleatory 
model for observed failures [9]. The conjugate prior to the multinomial model is the Dirichlet 
distribution. In the standard Bayesian approach, the analyst specifies the parameters of a precise 
Dirichlet distribution to model epistemic uncertainty in the alpha-factors, which are the parameters
of the multinomial aleatory model. This Dirichlet prior is then updated with observed data
to obtain a precise posterior distribution, also Dirichlet. 

In this paper, we follow [11] and adapt the imprecise Dirichlet model of Walley [13] to represent
epistemic uncertainty in the alpha-factors. In this approach the analyst specifies lower or upper
expectations (or both) for each alpha-factor, along with a learning parameter, which determines 
how quickly the prior distribution learns from observed data. We find that values in the range 
of 1 to 10 seem reasonable for this application. 



Key words and phrases. common-cause failure; alpha-factor model; epistemic uncertainty; conjugate prior;

Following [11], the approach is compared with that of Kelly and Atwood [8], which attempted
to find a precise Dirichlet prior that was minimally informative [2], in the sense that it incorporated
specified mean values for the alpha-factors, but which was otherwise quite diffuse. The
numerical example from [8] is addressed in the imprecise Dirichlet framework, which can be seen
as an extension of the approach of [8] to the case where a precise mean for each alpha-factor
cannot be specified.

Finally, we address the problem (not discussed in [11]) of inference about actual failure rates.
These failure rates are rational functions of the alpha-factors and the marginal failure rate per 
component. Modelling failures as a Poisson process, we take a Gamma distribution as conjugate 
prior for the marginal failure rate. Similar to the procedure for the alpha-factors, we can model 
epistemic uncertainty on the marginal failure rate by considering lower and upper expected prior 
failure rates, along with a learning parameter that determines how quickly the prior distribution 
learns from observed data. 

By combining our epistemic uncertainty models for both the alpha-factors and the marginal 
failure rate, we are able to perform a global sensitivity analysis on the common-cause failure 
rates. We provide an algorithm that calculates, up to reasonable precision, bounds on these 
failure rates. The resulting novel procedure is demonstrated on a simple electrical network 
reliability problem. 

The paper is organized as follows. Section 2 reviews the basic parameter model and its
reparametrisation as the alpha-factor model. Section 3 explores how the parameters of the
alpha-factor model can be estimated, using Dirichlet and Gamma priors. Section 4 discusses the
handling of epistemic uncertainty for the alpha-factors. Two ways to choose a Dirichlet prior
(or sets of Dirichlet priors) starting from epistemic prior expectations of the alpha-factors are
considered. Throughout, the main ideas are demonstrated on a numerical example. Section 5
shows how, similarly to the alpha-factor case, epistemic uncertainty can be expressed for the
marginal failure rate. A set of conjugate Gamma priors is elicited by considering lower and
upper expected prior marginal failure rates. Section 6 describes an algorithm that infers bounds
on all common-cause failure rates based on our imprecise alpha-factor model and our imprecise
marginal failure rate model. Section 7 demonstrates our methodology on a simple electrical
network reliability problem. Section 8 ends the paper with some conclusions and thoughts for
further research.



2. Common-Cause Failure Modelling 

2.1. The Basic Parameter Model. Consider a system that consists of k components. Throughout,
we make the following standard assumptions: (i) repair is immediate, and (ii) failures follow
a Poisson process.

For simplicity, we assume that all k components are exchangeable, in the sense that they have
identical failure rates. More precisely, we assume that all events involving exactly j components
failing have the same failure rate, which we denote by $q_j$. This model is called the basic parameter
model, and we write $q$ for $(q_1, \dots, q_k)$.

For example, if we have three components, A, B, and C, then the rate at which we see only A
failing is equal to the rate at which we see only B failing, and is also equal to the rate at which
we see only C failing; this failure rate is $q_1$. Moreover, the rate at which we observe only A and
B jointly failing is equal to the rate at which we observe only B and C jointly failing, and also
equal to the rate at which we observe only A and C jointly failing; this failure rate is $q_2$. The
rate at which we see all three components jointly failing is $q_3$.
In case of k identical components without common-cause failure modes, thus each failing
independently at rate $\lambda$, we would have^1

(1)  $q_1 = \lambda$ and $q_j = 0$ for $j \ge 2$.

The fact that we allow arbitrary values for the qj reflects the lack of independence, and whence, 
our modelling of common-cause failures. At this point, it is worth noting that we do not actually 
write down a statistical model for all possible common-cause failure modes — we could do so if 
this information was available, and in fact, this could render the basic parameter model obsolete, 
and allow for more detailed inferences. In essence, the basic parameter model allows us to 
statistically model lack of independence between component failures, without further detail as 
to where dependencies arise from: all failure modes are lumped together, so to speak. 

It is useful to note that it is possible, and sometimes necessary, to relax the exchangeability 
assumption to accommodate specific asymmetric cases. For example, when components are in 
different states of health, single failures would clearly not have identical failure rates. Because
the formulas become a lot more complicated, we stick to the exchangeable case here. 

Clearly, to answer typical reliability questions, such as for instance "what is the probability
that two or more components fail in the next month?", we need $q$. In practice, the following
three issues commonly arise. First, $q$ is rarely measured directly, as failure data is often collected
only per component. Secondly, when direct data about joint failures is available, typically, this
data is sparse, because events involving more than two components failing simultaneously are
usually quite rare. Thirdly, there are usually two distinct sources of failure data, one usually very
large data set related to failure per component, and one usually much smaller data set related to 
joint failures. For these reasons, it is sensible to reparametrise the model in terms of parameters 
that can be more easily estimated, as follows. 

2.2. The Alpha-Factor Model. The alpha-factor parametrisation of the basic parameter
model [9] starts out with considering the total failure rate of a component $q_t$, which could
involve failure of any number of components, that is, this is the rate obtained by looking at just
one component, ignoring everything else. Clearly,

(2)  $q_t = \sum_{j=1}^{k} \binom{k-1}{j-1} q_j.$



For example, again consider a three component system, A, B, and C. The rate at which A fails
is then the rate at which only A fails ($q_1$), plus the rate at which A and B, or A and C fail ($2q_2$),
plus the rate at which all three components fail ($q_3$).

Next, the alpha-factor model introduces $\alpha_j$ (the so-called alpha-factor), which denotes the
probability of exactly j of the k components failing given that failure occurs; in terms of relative
frequency, $\alpha_j$ is the fraction of failures that involve exactly j failed components. We write $\alpha$ for
$(\alpha_1, \dots, \alpha_k)$. Clearly,

(3)  $\alpha_j = \frac{\binom{k}{j} q_j}{\sum_{\ell=1}^{k} \binom{k}{\ell} q_\ell}.$

For example, again consider A, B, and C. Then the rate at which exactly one component fails is
$3q_1$ (as we have three single components, each of which fails with rate $q_1$), the rate at which
exactly two components fail is $3q_2$ (as we have three combinations of two components, each
combination failing with rate $q_2$), and the rate at which all components fail is $q_3$. Translating
these rates into fractions, we arrive precisely at Eq. (3).



^1 This is due to our Poisson assumption, and the assumption of immediate repair: independent Poisson processes
never generate events simultaneously when we observe failure times precisely.
It can be shown [Table C-1, p. C-5] that^2

(4)  $q_j = \frac{1}{\binom{k-1}{j-1}} \frac{j \alpha_j}{\sum_{\ell=1}^{k} \ell \alpha_\ell} q_t.$

Eqs. (2), (3), and (4) establish a one-to-one link between the so-called basic parameter model
($q$) and the alpha-factor model ($q_t$, $\alpha$). The benefit of the alpha-factor model over the basic
parameter model lies in its distinction between the total failure rate of a component $q_t$, for which
we generally have a lot of information, and common-cause failures modelled by $\alpha$, for which we
generally have very little information.
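The one-to-one link stated by Eqs. (2), (3), and (4) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names and the three-component rates are hypothetical and chosen purely for illustration.

```python
from math import comb

def total_rate_from_q(q):
    """Eq. (2): total failure rate q_t of one component from q = [q_1, ..., q_k]."""
    k = len(q)
    return sum(comb(k - 1, j - 1) * q[j - 1] for j in range(1, k + 1))

def alpha_from_q(q):
    """Eq. (3): alpha-factors from the basic parameters."""
    k = len(q)
    weighted = [comb(k, j) * q[j - 1] for j in range(1, k + 1)]
    total = sum(weighted)
    return [w / total for w in weighted]

def q_from_alpha(alpha, q_t):
    """Eq. (4): basic parameters from alpha-factors and the total rate q_t."""
    k = len(alpha)
    denom = sum(l * alpha[l - 1] for l in range(1, k + 1))
    return [j * alpha[j - 1] * q_t / (comb(k - 1, j - 1) * denom)
            for j in range(1, k + 1)]

# Hypothetical rates for a three-component system (illustration only).
q = [1.0e-3, 2.0e-5, 5.0e-6]
q_t = total_rate_from_q(q)        # q_1 + 2 q_2 + q_3
alpha = alpha_from_q(q)
q_back = q_from_alpha(alpha, q_t)  # should reproduce q up to rounding
print(q_t, alpha, q_back)
```

Mapping $q$ to ($q_t$, $\alpha$) and back again should return the original rates up to floating-point rounding, which is what the round trip above illustrates.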

One of the goals of this paper is to perform a sensitivity analysis, in the sense of robust Bayes
[31 SJ HI], over $\alpha$, and to measure its effects on $q_j$. Because the $q_j$ are proportional to $q_t$,
it turns out to take only very little additional effort to perform a sensitivity analysis over
$\alpha$ and $q_t$ jointly. So, although in many cases of practical interest we will know $q_t$ quite well,
we do not need to assume that we know much at all about $q_t$.

3. Parameter Estimation 

3.1. Dirichlet Prior for Alpha-Factors. Suppose that we have observed a sequence of N
failure events, where we have counted the number of components involved with each failure
event, say $n_j$ of the N observed failure events involved exactly j failed components. We write $n$
for $(n_1, \dots, n_k)$. In terms of the alpha-factors, the likelihood for $n$ has a very simple form:

(5)  $\Pr(n \mid \alpha) = \prod_{j=1}^{k} \alpha_j^{n_j},$

which is a multinomial distribution with parameter $\alpha$.

As mentioned already, typically, for $j \ge 2$, the $n_j$ are very low, with zero being quite common
for larger j. In such cases, standard techniques such as maximum likelihood for estimating the 
alpha-factors fail to produce sensible inferences. For any inference to be reasonably possible, it 
has been recognized [5] that we have to rely on epistemic information, that is, information which 
is not just described by the data. 

A standard way to include epistemic information in the model is through specification of a
Dirichlet prior for the alpha-factors [5]:

(6)  $f(\alpha \mid s, t) \propto \prod_{j=1}^{k} \alpha_j^{s t_j - 1},$

which is a conjugate prior for the multinomial likelihood specified in Eq. (5). In Eq. (6), we use
Walley's [12, §7.7.3, p. 395] $(s, t)$ notation for the hyperparameters. Here, $s > 0$ and $t \in \Delta$,
where $\Delta$ is the $(k-1)$-dimensional unit simplex:

(7)  $\Delta = \{(t_1, \dots, t_k) : t_1 \ge 0, \dots, t_k \ge 0, \sum_{j=1}^{k} t_j = 1\}.$

An interpretation for these parameters will be given shortly. First, let us calculate the posterior
density for $\alpha$:

(8)  $f(\alpha \mid n, s, t) \propto \prod_{j=1}^{k} \alpha_j^{s t_j + n_j - 1}.$




^2 Hint: consider $\sum_{\ell=1}^{k} \ell \alpha_\ell$.
Of typical interest is for instance the posterior expectation of the probability $\alpha_j$ of observing
j of the k components failing due to a common cause given that failure occurs:

(9)  $E(\alpha_j \mid n, s, t) = \int_\Delta \alpha_j f(\alpha \mid n, s, t)\, d\alpha = \frac{n_j + s t_j}{N + s} = \frac{N}{N + s} \frac{n_j}{N} + \frac{s}{N + s} t_j,$

where $N = \sum_{j=1}^{k} n_j$ is the total number of observations.



Eq. (9) provides the usual well-known interpretation for the hyperparameters s and t:

• If N = 0, then $E(\alpha_j \mid s, t) = t_j$, so $t_j$ is the prior expected chance of observing j of the k
components failing due to a common cause, given that failure occurs.

• $E(\alpha_j \mid n, s, t)$ is a weighted average of $t_j$ and $n_j/N$ (the proportion of j-component failures
in the N observations), with weights s and N, respectively. The parameter s thus
determines how much data is required for the posterior to start moving away from the
prior. If $N \ll s$ then the prior will weigh more; if N = s, then prior and data will weigh
equally; and if $N \gg s$, then the data will weigh more. In particular, $E(\alpha_j \mid n, s, t) = t_j$ if
N = 0 (as already mentioned), and $E(\alpha_j \mid n, s, t) \to n_j/N$ as $N \to \infty$.

For inference about $q_j$, which we will discuss in Section 6, we will also need, for natural
numbers $p_1, \dots, p_k$, with $P := \sum_{j=1}^{k} p_j$:

(10)  $E\left(\prod_{j=1}^{k} \alpha_j^{p_j} \,\middle|\, n, s, t\right) = \frac{\prod_{j=1}^{k} (n_j + s t_j)_{p_j}}{(N + s)_P},$

where $(x)_n$, for $n \in \mathbb{N}_0$, denotes the rising factorial, also known as Pochhammer's symbol [TJ
6.1.22, p. 256]:

(11)  $(x)_n := \frac{\Gamma(x + n)}{\Gamma(x)} = (x + n - 1)(x + n - 2) \cdots (x + 1) x.$

By linearity of expectation, Eq. (10) allows us to calculate the expectation of an arbitrary
polynomial in $\alpha$.
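The moment formula of Eq. (10) is straightforward to evaluate with the rising factorial of Eq. (11). The sketch below is illustrative only: the function names, the counts `n`, and the hyperparameters `s` and `t` are hypothetical values, not taken from the paper.

```python
from math import prod

def rising_factorial(x, n):
    """Eq. (11): (x)_n = x (x + 1) ... (x + n - 1), with (x)_0 = 1."""
    result = 1.0
    for i in range(n):
        result *= x + i
    return result

def dirichlet_moment(n, s, t, p):
    """Eq. (10): posterior expectation of prod_j alpha_j**p_j."""
    N = sum(n)
    P = sum(p)
    numerator = prod(rising_factorial(nj + s * tj, pj)
                     for nj, tj, pj in zip(n, t, p))
    return numerator / rising_factorial(N + s, P)

# Hypothetical two-component data and prior (illustration only).
n, s, t = [8, 2], 2.0, [0.9, 0.1]
mean_1 = dirichlet_moment(n, s, t, [1, 0])   # reduces to Eq. (9)
cross = dirichlet_moment(n, s, t, [1, 1])    # E(alpha_1 * alpha_2)
# Since alpha_1 + alpha_2 = 1, the second moments must satisfy
# E(alpha_1^2) + 2 E(alpha_1 alpha_2) + E(alpha_2^2) = 1.
total = (dirichlet_moment(n, s, t, [2, 0]) + 2 * cross
         + dirichlet_moment(n, s, t, [0, 2]))
print(mean_1, cross, total)
```

For $p = (1, 0)$ the result coincides with the posterior mean of Eq. (9), and the sum-to-one identity gives a quick sanity check on the implementation.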

3.2. Per Component Failure Rate. Now we turn to the estimation of $q_t$, the total failure
rate per component. As mentioned at the start of Section 2, we assume that failures follow a
Poisson process. Suppose we observe M failures of our component over a time interval of length
T. If M is sufficiently large, then a reasonable point estimate for $q_t$ would be M/T.

Often, that will be enough. However, in case M is not very large, we can easily propose a
conjugate prior for $q_t$. Specifically, the likelihood for M, given T, is:

(12)  $\Pr(M \mid q_t, T) = \frac{(q_t T)^M e^{-q_t T}}{M!},$

which is simply a Poisson distribution with parameter $q_t T$.

A standard way to include epistemic information in the model is through specification of a
Gamma prior^3

(13)  $f(q_t \mid u, v) \propto q_t^{uv - 1} e^{-q_t u},$

which is a conjugate prior for the Poisson likelihood specified in Eq. (12). The posterior density
for $q_t$ is:

(14)  $f(q_t \mid M, T, u, v) \propto q_t^{uv + M - 1} e^{-q_t (u + T)}.$

^3 We use a non-standard parametrisation to allow easier interpretation of the hyperparameters.
Of typical interest is the posterior expectation of $q_t$:

(15)  $E(q_t \mid M, T, u, v) = \int_0^\infty q_t f(q_t \mid M, T, u, v)\, dq_t = \frac{M + uv}{u + T} = \frac{T}{u + T} \frac{M}{T} + \frac{u}{u + T} v.$

Eq. (15) provides a straightforward interpretation for the hyperparameters u and v, which
mimics our discussion concerning the Dirichlet prior:^4

• If T = 0, then $E(q_t \mid u, v) = v$, so v is the prior expected failure rate.

• $E(q_t \mid M, T, u, v)$ is a weighted average of v and M/T (the empirical observed failure rate),
with weights u and T, respectively. The parameter u thus determines for how long we
need to observe the process until the posterior starts to move away from the prior. If
$T \ll u$ then the prior will weigh more; if T = u, then prior and data will weigh equally;
and if $T \gg u$, then the data will weigh more. In particular, $E(q_t \mid M, T, u, v) = v$ if T = 0
(as already mentioned), and $E(q_t \mid M, T, u, v) \to M/T$ as $T \to \infty$.
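The weighted-average reading of the Gamma posterior mean can be illustrated with a few lines of code. This is a sketch under assumed inputs: the function name and the values of M, T, u, and v are hypothetical.

```python
def gamma_posterior_mean(M, T, u, v):
    """Posterior mean (M + u v) / (u + T) of q_t under the (u, v) Gamma prior."""
    return (M + u * v) / (u + T)

# Hypothetical inputs: 3 failures in 100 time units; prior mean 0.02, weight u = 50.
M, T, u, v = 3, 100.0, 50.0, 0.02
mean = gamma_posterior_mean(M, T, u, v)
# The same value written as a weighted average of the data rate M/T and the prior mean v.
weighted = T / (u + T) * (M / T) + u / (u + T) * v
# With no observation time (T = 0, M = 0) the posterior mean equals the prior mean v.
prior_only = gamma_posterior_mean(0, 0.0, u, v)
# With T = u, prior and data weigh equally: the mean is the midpoint of v and M/T.
equal_weight = gamma_posterior_mean(3, 50.0, 50.0, v)
print(mean, weighted, prior_only, equal_weight)
```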



4. Handling Epistemic Uncertainty in Alpha-Factors 

Crucial to reliable inference in the alpha-factor model is proper modelling of epistemic uncertainty
about failures, which is in the above approach expressed through the $(s, t)$ parameters.
We focus on two methods for elicitation of these parameters, and the inferences that result from
them.

Throughout, we will use the following example, which is taken from Kelly and Atwood [8].
Consider a system with four redundant components (k = 4). The probability of j out of k
failures, given that failure has happened, was denoted by $\alpha_j$. We assume that the analyst's prior
expectation $\mu_{\mathrm{spec},j}$ for each $\alpha_j$ is:

(16)  $\mu_{\mathrm{spec},1} = 0.950 \quad \mu_{\mathrm{spec},2} = 0.030 \quad \mu_{\mathrm{spec},3} = 0.015 \quad \mu_{\mathrm{spec},4} = 0.005$

We have 36 observations, in which 35 showed one component failing, and 1 showed two components
failing:

$n_1 = 35 \quad n_2 = 1 \quad n_3 = n_4 = 0$

4.1. Constrained Non-Informative Prior. Atwood [2] studied priors for the binomial model
which maximise entropy (and whence, are 'non-informative') whilst constraining the mean to a
specific value. Although these priors are not conjugate, Atwood [2] showed that they can be well
approximated by Beta distributions, which are conjugate. Kelly and Atwood [8] applied this
approach to the multinomial model with conjugate Dirichlet priors, by choosing a constrained
non-informative prior for the marginals of the Dirichlet, which are Beta. This leads to an
over-specified system of equalities, which can be solved via least-squares optimisation.

For the problem we are interested in, $\mu_{\mathrm{spec},1}$ is close to 1. In this case, the solution of the
least-squares problem turns out to be close to:

(17)  $t_j = \mu_{\mathrm{spec},j}$ for all $j \in \{1, \dots, k\}$, $\quad s = \frac{1}{2(1 - \mu_{\mathrm{spec},1})}.$



^4 In fact, we arrive at similar interpretations because both priors are members of the canonical exponential
family PITO].



A ROBUST BAYESIAN APPROACH TO MODELLING EPISTEMIC UNCERTAINTY IN COMMON-CAUSE FAILURE MODELS 



For our example, this means that s = 10 [8, p. 400, §3]. An obvious calculation reveals that,
under this prior [8, p. 401, §3.1]:

$E(\alpha_1 \mid n, s, t) = \frac{35 + 9.5}{36 + 10} = 0.967 \qquad E(\alpha_2 \mid n, s, t) = \frac{1 + 0.3}{36 + 10} = 0.028$

$E(\alpha_3 \mid n, s, t) = \frac{0 + 0.15}{36 + 10} = 0.003 \qquad E(\alpha_4 \mid n, s, t) = \frac{0 + 0.05}{36 + 10} = 0.001$

Kelly and Atwood [8, p. 402, §4] compare these results against a large number of other choices of
priors, and note that the posterior resulting from Eq. (17) seems too strongly influenced by the
prior, particularly in the presence of zero counts. For instance, the uniform prior is a Dirichlet
distribution with hyperparameters $t_j = 0.25$ and s = 4, which gives:

$E(\alpha_1 \mid n, s, t) = \frac{35 + 1}{36 + 4} = 0.9 \qquad E(\alpha_2 \mid n, s, t) = \frac{1 + 1}{36 + 4} = 0.05$

$E(\alpha_3 \mid n, s, t) = \frac{0 + 1}{36 + 4} = 0.025 \qquad E(\alpha_4 \mid n, s, t) = \frac{0 + 1}{36 + 4} = 0.025$

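Jeffreys' prior is again a Dirichlet distribution, with hyperparameters $t_j = 0.25$ and s = 2 (so that $s t_j = 0.5$ and $t \in \Delta$), which
gives:

$E(\alpha_1 \mid n, s, t) = \frac{35 + 0.5}{36 + 2} = 0.934 \qquad E(\alpha_2 \mid n, s, t) = \frac{1 + 0.5}{36 + 2} = 0.039$

$E(\alpha_3 \mid n, s, t) = \frac{0 + 0.5}{36 + 2} = 0.013 \qquad E(\alpha_4 \mid n, s, t) = \frac{0 + 0.5}{36 + 2} = 0.013$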

The degree of variation in the posterior under different priors is evidently somewhat alarming. 
In the next section, we aim to robustify the model by using sets of priors from the start. 
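The sensitivity to the choice of precise prior is easy to reproduce. The following sketch recomputes the posterior means for the constrained non-informative prior of Eq. (17) and for the uniform prior, using the data of the running example; the function and variable names are illustrative choices, not notation from the paper.

```python
def posterior_means(n, s, t):
    """Eq. (9): posterior expectation (n_j + s t_j) / (N + s) for every j."""
    N = sum(n)
    return [(nj + s * tj) / (N + s) for nj, tj in zip(n, t)]

n = [35, 1, 0, 0]                      # observed counts of the running example
mu_spec = [0.950, 0.030, 0.015, 0.005]  # prior expectations, Eq. (16)

# Constrained non-informative prior, Eq. (17): t = mu_spec, s = 1 / (2 (1 - mu_spec_1)).
s_cni = 1.0 / (2.0 * (1.0 - mu_spec[0]))
cni = posterior_means(n, s_cni, mu_spec)

# Uniform prior: t_j = 0.25 and s = 4.
uniform = posterior_means(n, 4.0, [0.25] * 4)

for j in range(4):
    print(f"alpha_{j + 1}: constrained {cni[j]:.3f}, uniform {uniform[j]:.3f}")
```

The two priors disagree most strongly, in relative terms, on the components with zero counts, which is the behaviour the text describes as alarming.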

4.2. Imprecise Dirichlet Model. 

4.2.1. Near-Ignorance Model. In case no prior information is available, Walley proposes as a so-called
near-ignorance prior a set of Dirichlet priors, with hyperparameters constrained to the
set:

$\mathcal{H} = \{(s, t) : t \in \Delta\}$

for some fixed value of s, which determines the learning speed of the model [12, p. 218, §5.3.2]
[13, p. 9, §2.3].

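4.2.2. General Model. When prior information is available, more generally, we may assume that
we can specify a subset $\mathcal{H}$ of $(0, +\infty) \times \Delta$. Following Walley's suggestions [12, p. 224, §5.4.3]
[13, p. 32, §6], we take

(18)  $\mathcal{H} = \{(s, t) : s \in [\underline{s}, \overline{s}],\ t \in \Delta,\ t_j \in [\underline{t}_j, \overline{t}_j]\},$

where the analyst has to specify the bounds $[\underline{t}_j, \overline{t}_j]$ for each $j \in \{1, \dots, k\}$, and $[\underline{s}, \overline{s}]$.
The posterior lower and upper expectations of $\alpha_j$ are:

(19)  $\underline{E}(\alpha_j \mid n, \mathcal{H}) = \min\left\{ \frac{n_j + \underline{s}\,\underline{t}_j}{N + \underline{s}}, \frac{n_j + \overline{s}\,\underline{t}_j}{N + \overline{s}} \right\} = \begin{cases} \frac{n_j + \underline{s}\,\underline{t}_j}{N + \underline{s}} & \text{if } \underline{t}_j \ge n_j/N \\ \frac{n_j + \overline{s}\,\underline{t}_j}{N + \overline{s}} & \text{if } \underline{t}_j \le n_j/N \end{cases}$

(20)  $\overline{E}(\alpha_j \mid n, \mathcal{H}) = \max\left\{ \frac{n_j + \underline{s}\,\overline{t}_j}{N + \underline{s}}, \frac{n_j + \overline{s}\,\overline{t}_j}{N + \overline{s}} \right\} = \begin{cases} \frac{n_j + \overline{s}\,\overline{t}_j}{N + \overline{s}} & \text{if } \overline{t}_j \ge n_j/N \\ \frac{n_j + \underline{s}\,\overline{t}_j}{N + \underline{s}} & \text{if } \overline{t}_j \le n_j/N \end{cases}$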



For the model to be of any use, we must be able to elicit the bounds. The interval $[\underline{t}_j, \overline{t}_j]$
simply represents bounds on the prior expectation of the chance $\alpha_j$.
Fixed Learning Parameter. Typically, the learning parameter s is taken to be 2 (not without
controversy; see insightful discussions in [13]). One might therefore be tempted to use the
same prior expectations $t_j$ for the $\alpha_j$ as above (Eq. (16)), with s = 2, resulting in the following
posterior expectations:

$E(\alpha_1 \mid n, s, t) = \frac{35 + 1.9}{36 + 2} = 0.971 \qquad E(\alpha_2 \mid n, s, t) = \frac{1 + 0.06}{36 + 2} = 0.028$

$E(\alpha_3 \mid n, s, t) = \frac{0 + 0.03}{36 + 2} = 0.0007 \qquad E(\alpha_4 \mid n, s, t) = \frac{0 + 0.01}{36 + 2} = 0.0002$


Whence, for this example, it is obvious that s = 2 is an excessively poor choice: the posterior 
expectations in case of zero counts are pulled way too much towards zero. One might suspect 
that this is partly due to the strong prior information, that is, the knowledge of tj. However, 
even if we interpret the given probabilities as bounds, say: 

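(21a)  $[\underline{t}_1, \overline{t}_1] = [0.950, 1]$

(21b)  $[\underline{t}_2, \overline{t}_2] = [0, 0.030]$

(21c)  $[\underline{t}_3, \overline{t}_3] = [0, 0.015]$

(21d)  $[\underline{t}_4, \overline{t}_4] = [0, 0.005]$

we still find:

(22a)  $[\underline{E}(\alpha_1 \mid n, \mathcal{H}), \overline{E}(\alpha_1 \mid n, \mathcal{H})] = [0.971, 0.974]$

(22b)  $[\underline{E}(\alpha_2 \mid n, \mathcal{H}), \overline{E}(\alpha_2 \mid n, \mathcal{H})] = [0.026, 0.028]$

(22c)  $[\underline{E}(\alpha_3 \mid n, \mathcal{H}), \overline{E}(\alpha_3 \mid n, \mathcal{H})] = [0, 0.0007]$

(22d)  $[\underline{E}(\alpha_4 \mid n, \mathcal{H}), \overline{E}(\alpha_4 \mid n, \mathcal{H})] = [0, 0.0002]$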

Clearly, only the posterior inferences about $\alpha_1$ (and perhaps also $\alpha_2$) seem reasonable. We
conclude that the imprecise Dirichlet model with s = 2 learns too fast from the data in case of 
zero counts. 

On the one hand, when counts are sufficiently far from zero, the posterior probability with
s = 2, and perhaps even s = 1 or s = 0, seems appropriate. For zero counts, however, a larger
value of s seems mandatory. Therefore, it seems logical to pick an interval for s.

A further argument for choosing an interval for s, in case of an informative set of priors, is
provided by Walley [12, p. 225, §5.4.4]: a larger value of s ensures that the posterior does not
move away too fast from the prior, which is particularly important for zero counts, and the
difference between $\underline{s}$ and $\overline{s}$ effectively results in greater posterior imprecision if $n_j/N \notin [\underline{t}_j, \overline{t}_j]$.

To see this, note that, if $\underline{t}_j \le n_j/N \le \overline{t}_j$, it follows from Eqs. (19) and (20) that both lower
and upper posterior expectation are calculated using $\overline{s}$. When $n_j/N < \underline{t}_j$ (or $\overline{t}_j < n_j/N$), the
lower (upper) posterior expectation is calculated using $\underline{s}$ instead, which is nearer to $n_j/N$ due
to the lower weight $\underline{s}$ for the prior bound $\underline{t}_j$ ($\overline{t}_j$). The increased imprecision reflects the conflict
between the prior assignment $[\underline{t}_j, \overline{t}_j]$ and the observed fraction $n_j/N$, and this is referred to as
prior-data conflict (also see [14]).

Interval for Learning Parameter. We follow Good [TJ p. 19] (as suggested by Walley [12, Note 5.4.1,
p. 524]), and reason about posterior expectations of hypothetical data to elicit $\underline{s}$ and $\overline{s}$; also see
[12, p. 219, §5.3.3] for further discussion on elicitation of s (our approach is similar, but simpler
for the case under study). We assume that $\overline{t}_1 = 1$ and $\underline{t}_j = 0$ for all $j \ge 2$.
The upper probability of multiple ($j \ge 2$) failed components in trial m + 1, given one (j = 1)
failed component in all of the first m trials, is

$\overline{E}(\alpha_j \mid n_1 = m, N = m, \mathcal{H}) = \frac{\overline{s}\,\overline{t}_j}{m + \overline{s}}.$

(Note: there is no prior-data conflict in this case.) Whence, for the above probability to reduce
to $\overline{t}_j/2$ (i.e., to reduce the prior upper probability by half), we need that $m = \overline{s}$. In other
words, $\overline{s}$ is the number of one-component failures required to reduce the upper probabilities of
multi-component failure by half.

Conversely, the lower probability of one (j = 1) failed component in trial m + 1, given only
multiple ($j \ge 2$) failed components in the first m trials, is

$\underline{E}(\alpha_1 \mid n_1 = 0, N = m, \mathcal{H}) = \frac{\underline{s}\,\underline{t}_1}{m + \underline{s}}.$

(Note: there is strong prior-data conflict in this case.) In other words, $\underline{s}$ is the number of multi-component
failures required to reduce the lower probability of one-component failure by half. Note
that a few alternative interpretations present themselves. First, for $j \ge 2$,

$\overline{E}(\alpha_j \mid n_j = m, N = m, \mathcal{H}) = \frac{m + \underline{s}\,\overline{t}_j}{m + \underline{s}}.$

In other words, $\underline{s}$ is also the number of j-component failures required to increase the upper
probability of j components failing to $(1 + \overline{t}_j)/2$ (generally, this will be close to 1/2, provided
that $\overline{t}_j$ is close to zero). Secondly, for $j \ge 2$,

$\underline{E}(\alpha_j \mid n_j = m, N = m, \mathcal{H}) = \frac{m}{m + \overline{s}},$

so $\overline{s}$ is also the number of multi-component failures required to increase the lower probability of
multi-component failures to a half.

Any of these counts seem well suited for elicitation, and are easy to interpret. As a guideline, 
we suggest the following easily remembered rules: 

• $\overline{s}$ is the number of one-component failures required to reduce the upper probabilities of
multi-component failures by half, and

• $\underline{s}$ is the number of multi-component failures required to reduce the lower probability of
one-component failures by half.

Taking the above interpretation, the difference between $\underline{s}$ and $\overline{s}$ reflects the fact that the rate at
which we reduce upper probabilities is less than the rate at which we reduce lower probabilities, 
and thus reflects a level of caution in our model. 

Coming back to our example, reasonable values are $\underline{s} = 1$ (if we immediately observe multi-component
failures, we might be quite keen to reduce our lower probability for one-component
failure) and $\overline{s} = 10$ (we are happy to halve our upper probabilities of multi-component failures
after observing 10 one-component failures). With these values, when taking for $t_j$ the values
given in Eq. (16), we find the following posterior lower and upper expectations of $\alpha_j$:

(23a)  $[\underline{E}(\alpha_1 \mid n, \mathcal{H}), \overline{E}(\alpha_1 \mid n, \mathcal{H})] = [0.967, 0.972]$

(23b)  $[\underline{E}(\alpha_2 \mid n, \mathcal{H}), \overline{E}(\alpha_2 \mid n, \mathcal{H})] = [0.0278, 0.0283]$

(23c)  $[\underline{E}(\alpha_3 \mid n, \mathcal{H}), \overline{E}(\alpha_3 \mid n, \mathcal{H})] = [0.00041, 0.00326]$

(23d)  $[\underline{E}(\alpha_4 \mid n, \mathcal{H}), \overline{E}(\alpha_4 \mid n, \mathcal{H})] = [0.00014, 0.00109]$

These bounds indeed reflect caution in inferences where zero counts have occurred (j = 3 and 
j = 4), with upper expectations considerably larger as compared to the model with fixed s, while 
still giving a reasonable expectation interval for the probability of one-component failure. 
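The bounds in Eqs. (23a) to (23d) follow directly from Eqs. (19) and (20). The sketch below evaluates the min/max form of those equations for the running example with learning parameter interval [1, 10] and precise prior expectations from Eq. (16); the function name `idm_bounds` is an illustrative choice, not notation from the paper.

```python
def idm_bounds(n_j, N, s_lo, s_hi, t_lo, t_hi):
    """Posterior lower and upper expectation of alpha_j, per Eqs. (19) and (20).

    The posterior mean (n_j + s t) / (N + s) is increasing in t and monotone
    in s, so the extremes are attained at the end points of both intervals.
    """
    def post(s, t):
        return (n_j + s * t) / (N + s)
    lower = min(post(s_lo, t_lo), post(s_hi, t_lo))
    upper = max(post(s_lo, t_hi), post(s_hi, t_hi))
    return lower, upper

n = [35, 1, 0, 0]                  # observed counts
t = [0.950, 0.030, 0.015, 0.005]   # prior expectations, Eq. (16)
N = sum(n)

# Learning parameter interval [1, 10]; precise t_j, so t_lo = t_hi = t_j.
bounds = [idm_bounds(nj, N, 1.0, 10.0, tj, tj) for nj, tj in zip(n, t)]
for j, (lo, hi) in enumerate(bounds, start=1):
    print(f"alpha_{j}: [{lo:.5f}, {hi:.5f}]")
```

Passing distinct `t_lo` and `t_hi` values instead covers the case where the prior expectations themselves are only bounded.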
If we desire to specify our initial bounds for $t_j$ more conservatively, as in Eqs. (21), we find
similar results:

(24a)  $[\underline{E}(\alpha_1 \mid n, \mathcal{H}), \overline{E}(\alpha_1 \mid n, \mathcal{H})] = [0.967, 0.978]$

(24b)  $[\underline{E}(\alpha_2 \mid n, \mathcal{H}), \overline{E}(\alpha_2 \mid n, \mathcal{H})] = [0.0217, 0.0283]$

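(24c)  $[\underline{E}(\alpha_3 \mid n, \mathcal{H}), \overline{E}(\alpha_3 \mid n, \mathcal{H})] = [0, 0.00326]$

(24d)  $[\underline{E}(\alpha_4 \mid n, \mathcal{H}), \overline{E}(\alpha_4 \mid n, \mathcal{H})] = [0, 0.00109]$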

5. Handling Epistemic Uncertainty in Marginal Failure Rate 

Before we can consider inferences on the common-cause failure rates $q_j$, we will briefly explain
how we express epistemic uncertainty on the marginal failure rate $q_t$. As seen in Section 3.2,
we will use conjugate Gamma priors with hyperparameters u and v, where v is the prior failure
rate parameter, and u determines the learning speed. Similarly to the alpha-factor case, we can
express vague prior information on $q_t$ by considering sets of priors, which are generated by sets of
hyperparameters, i.e., we specify a parameter set $\mathcal{J} \subset (0, \infty) \times (0, \infty)$. Unlike in Section 4.2.1, here
$\mathcal{J} = \{u\} \times (0, \infty)$, for some fixed value of u, does not lead to a practically useful near-ignorant
set of priors, as then $\overline{E}(q_t \mid M, T, \mathcal{J}) = \infty$ for any M and T. In practice, it should not be a big
issue to find bounds $[\underline{v}, \overline{v}]$ for the prior expected marginal failure rate.

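Similarly to Eqs. (19) and (20), when $\mathcal{J} = [\underline{u}, \overline{u}] \times [\underline{v}, \overline{v}]$, the posterior lower and upper
expectations of $q_t$ are

(25)  $\underline{E}(q_t \mid M, T, \mathcal{J}) = \min\left\{ \frac{M + \underline{u}\,\underline{v}}{T + \underline{u}}, \frac{M + \overline{u}\,\underline{v}}{T + \overline{u}} \right\} = \begin{cases} \frac{M + \underline{u}\,\underline{v}}{T + \underline{u}} & \text{if } \underline{v} \ge M/T \\ \frac{M + \overline{u}\,\underline{v}}{T + \overline{u}} & \text{if } \underline{v} \le M/T \end{cases}$

(26)  $\overline{E}(q_t \mid M, T, \mathcal{J}) = \max\left\{ \frac{M + \underline{u}\,\overline{v}}{T + \underline{u}}, \frac{M + \overline{u}\,\overline{v}}{T + \overline{u}} \right\} = \begin{cases} \frac{M + \overline{u}\,\overline{v}}{T + \overline{u}} & \text{if } \overline{v} \ge M/T \\ \frac{M + \underline{u}\,\overline{v}}{T + \underline{u}} & \text{if } \overline{v} \le M/T \end{cases}$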

To elicit bounds for the learning parameter u, similar considerations as in Section 4.2.2 can
be made. Assuming $\underline{v} = 0$, the posterior lower expectation for $q_t$ is

(27)  $\underline{E}(q_t \mid M, T, \mathcal{J}) = \frac{M}{T + \overline{u}}.$

(Note: there is no prior-data conflict in this case.) Whence, $\overline{u}$ is the amount of time needed
to observe the process until we raise the lower expectation of $q_t$ from 0 to half of the observed
failure rate M/T.

Conversely, assuming $\underline{v} > 0$, and no failures at all during time T, the posterior lower expectation
for $q_t$ is

(28)  $\underline{E}(q_t \mid M = 0, T, \mathcal{J}) = \frac{\underline{u}\,\underline{v}}{T + \underline{u}} = \frac{\underline{v}}{T/\underline{u} + 1}.$

(Note: prior-data conflict is present in this case.) Whence, $\underline{u}$ is the time needed to observe the
process, without any failures, until $\underline{v}$ is reduced by half.

Contrary to the situation in Section 4, zero counts are much less of a concern when estimating
the marginal failure rate. For the sake of simplicity, it might therefore suffice to consider
parameter sets of the form

(29)  $\mathcal{J} = \{u\} \times [\underline{v}, \overline{v}]$

only. Both Eqs. (27) and (28) can then serve to determine $u = \underline{u} = \overline{u}$.
A numerical example will be given in Section 7.
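The two elicitation readings of u can be checked with a small sketch of Eqs. (25) and (26). The function name and all numerical inputs below are hypothetical and serve only to illustrate the halving behaviour described above.

```python
def gamma_bounds(M, T, u_lo, u_hi, v_lo, v_hi):
    """Posterior lower and upper expectation of q_t, per Eqs. (25) and (26)."""
    def post(u, v):
        return (M + u * v) / (T + u)
    lower = min(post(u_lo, v_lo), post(u_hi, v_lo))
    upper = max(post(u_lo, v_hi), post(u_hi, v_hi))
    return lower, upper

u = 10.0  # fixed learning parameter, as in the parameter set of Eq. (29)

# Eq. (27): with lower prior bound 0, observing for T = u raises the lower
# expectation from 0 to half of the observed rate M/T.
lower_27, _ = gamma_bounds(M=4, T=10.0, u_lo=u, u_hi=u, v_lo=0.0, v_hi=1.0)

# Eq. (28): with no failures during T = u, the lower prior bound is halved.
lower_28, upper_28 = gamma_bounds(M=0, T=10.0, u_lo=u, u_hi=u, v_lo=0.1, v_hi=0.4)

print(lower_27, lower_28, upper_28)
```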
[Table 1: the table body could not be recovered from the source.]

Table 1. Accuracy of first and second order Taylor approximations.



6. Inference on Failure Rates 



6.1. Expected Failure Rates. For inference on the failure rates $q_j$, we will now combine our
models for alpha-factors and marginal failure rate by using Eq. (4). The problem in doing this
is that there is, as far as we know, no immediate closed expression for the posterior expectation
of $q_j$, because Eq. (4) is a rational function of $\alpha$. However, naively, we can approximate it using
Taylor expansion. Specifically,

(30)  $q_j = \frac{1}{\binom{k-1}{j-1}} \frac{j \alpha_j}{\sum_{\ell=1}^{k} \ell \alpha_\ell} q_t$

(31)  $\phantom{q_j} = \frac{1}{\binom{k-1}{j-1}} \frac{j \alpha_j}{\sum_{\ell=1}^{k} (\alpha_\ell + (\ell - 1)\alpha_\ell)} q_t$

(32)  $\phantom{q_j} = \frac{1}{\binom{k-1}{j-1}} \frac{j \alpha_j}{1 + \sum_{\ell=2}^{k} (\ell - 1)\alpha_\ell} q_t$



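and, as long as $\sum_{\ell=2}^{k} (\ell - 1)\alpha_\ell < 1$ (this is always true if $k \le 2$; for larger k, it is usually
true because $\alpha_\ell$ is usually very small for $\ell \ge 3$), we can use the Taylor expansion
$1/(1 + x) = 1 - x + x^2 - x^3 + \dots$ (valid for $|x| < 1$), to arrive at:

(33)  $q_j = \frac{1}{\binom{k-1}{j-1}} j \alpha_j \left( 1 - \sum_{\ell=2}^{k} (\ell - 1)\alpha_\ell + \left( \sum_{\ell=2}^{k} (\ell - 1)\alpha_\ell \right)^2 - \dots \right) q_t.$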



The posterior expectation of Eq. (33) can now be evaluated, using the posterior moments of the
alpha-factors and the posterior expectation of $q_t$ given earlier, under the usual assumption
that $q_t$ is independent of the alpha-factors.

To get a better idea of accuracy, Table 1 tabulates first and second order approximations of
$1/(1+x)$. For example, second order approximation remains fairly accurate for
$\sum_{\ell=2}^{k} (\ell-1)\alpha_\ell < 0.5$, and first order approximation for
$\sum_{\ell=2}^{k} (\ell-1)\alpha_\ell < 0.3$.
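Values of the kind reported in Table 1 are easy to recompute. The following Python sketch is our own illustration (the helper name `taylor_inverse` is hypothetical, and the grid $x = 0.1, \dots, 0.9$ is an assumption); it tabulates $1/(1+x)$ next to its first and second order Taylor approximations.

```python
def taylor_inverse(x, order):
    """Partial sum 1 - x + x^2 - ... + (-x)^order of the series for 1/(1+x)."""
    return sum((-x) ** i for i in range(order + 1))


# One row per x: exact value, first order and second order approximation.
rows = []
for i in range(1, 10):
    x = i / 10
    rows.append((x, 1 / (1 + x), taylor_inverse(x, 1), taylor_inverse(x, 2)))

print("x    exact  order1 order2")
for x, exact, first, second in rows:
    print(f"{x:.1f}  {exact:.3f}  {first:.3f}  {second:.3f}")
```

The second order column stays within a few hundredths of the exact value up to about $x = 0.3$, whereas the first order column degrades much faster as $x$ grows.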

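An obvious issue with Taylor approximation is that the domain of integration includes values
for $\alpha$ where the Taylor series does not converge. However, it is easy to see that, for any $x \ge 0$
(not just those for which $|x| < 1$):

(34) $0 \le \left(1 - x + x^2 - \dots + (-x)^p\right) - \frac{1}{1+x} \le x^{p+1}$ for even $p$, and

(35) $0 \le \frac{1}{1+x} - \left(1 - x + x^2 - \dots + (-x)^p\right) \le x^{p+1}$ for odd $p$.

Indeed, the partial sum equals $\left(1 - (-x)^{p+1}\right)/(1+x)$, so it differs from $1/(1+x)$ by
$x^{p+1}/(1+x)$ in absolute value.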
Therefore, for any non-negative random variables $x$ and $y$:

(36) $0 \le E\left[y\left(1 - x + x^2 - \dots + (-x)^p\right)\right] - E\left(\frac{y}{1+x}\right) \le E\left(y x^{p+1}\right)$ for even $p$, and

(37) $0 \le E\left(\frac{y}{1+x}\right) - E\left[y\left(1 - x + x^2 - \dots + (-x)^p\right)\right] \le E\left(y x^{p+1}\right)$ for odd $p$.

So as long as the expectation of $y x^{p+1}$ is small enough, taking the expectation over the Taylor
expansion, of order $p$, will provide a reasonable approximation.
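The pointwise bounds behind this argument can be checked numerically. The sketch below is an illustration only (function names are ours); it evaluates the signed truncation gap on a grid of non-negative $x$, including values beyond the radius of convergence, and records any violation of $0 \le \text{gap} \le x^{p+1}$.

```python
def partial_sum(x, p):
    """Taylor polynomial 1 - x + x^2 - ... + (-x)^p."""
    return sum((-x) ** i for i in range(p + 1))


def truncation_gap(x, p):
    """Gap between polynomial and 1/(1+x), signed so that it is >= 0 for x >= 0."""
    diff = partial_sum(x, p) - 1 / (1 + x)
    return diff if p % 2 == 0 else -diff


violations = []
for p in range(0, 7):
    for i in range(0, 301):
        x = i / 100  # grid on [0, 3], so |x| < 1 is not required
        gap = truncation_gap(x, p)
        # Small tolerance for floating point rounding.
        if not (-1e-9 <= gap <= x ** (p + 1) + 1e-9):
            violations.append((x, p, gap))

print("violations found:", len(violations))
```

A grid check of this kind is of course no proof; it only illustrates that the bound does not depend on convergence of the series.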

As an example, for the special but important case of $k = 2$, we derive expressions for the
posterior expectation of $q_1$ and $q_2$, under second order approximation, along with error term:

(38) $E(q_1 | n, s, t; M, T, u, v)$

(39) $\approx E(\alpha_1 [1 - \alpha_2 + \alpha_2^2] \,|\, n, s, t) \, E(q_t | M, T, u, v)$

(40) $= \left( \frac{n_1 + st_1}{N + s} - \frac{(n_1 + st_1)_1 (n_2 + st_2)_1}{(N + s)_2} + \frac{(n_1 + st_1)_1 (n_2 + st_2)_2}{(N + s)_3} \right) \frac{M + uv}{u + T}$

(41) $= \frac{n_1 + st_1}{N + s} \left( 1 - \frac{n_2 + st_2}{N + s + 1} \left( 1 - \frac{n_2 + st_2 + 1}{N + s + 2} \right) \right) \frac{M + uv}{u + T}$

Here $(x)_p := x(x+1)\cdots(x+p-1)$ denotes the rising factorial. The approximation holds
up to an absolute expected error less than

(42) $E(\alpha_1 \alpha_2^3 \,|\, n, s, t) \, E(q_t | M, T, u, v) = \frac{(n_1 + st_1)(n_2 + st_2)(n_2 + st_2 + 1)(n_2 + st_2 + 2)}{(N + s)(N + s + 1)(N + s + 2)(N + s + 3)} \, \frac{M + uv}{u + T}$

and similarly,

(43) $E(q_2 | n, s, t; M, T, u, v)$

(44) $\approx E(2\alpha_2 [1 - \alpha_2 + \alpha_2^2] \,|\, n, s, t) \, E(q_t | M, T, u, v)$

(45) $= 2 \left( \frac{n_2 + st_2}{N + s} - \frac{(n_2 + st_2)_2}{(N + s)_2} + \frac{(n_2 + st_2)_3}{(N + s)_3} \right) \frac{M + uv}{u + T}$

(46) $= 2 \, \frac{n_2 + st_2}{N + s} \left( 1 - \frac{n_2 + st_2 + 1}{N + s + 1} \left( 1 - \frac{n_2 + st_2 + 2}{N + s + 2} \right) \right) \frac{M + uv}{u + T}$

up to an absolute expected error less than

(47) $E(2\alpha_2 \alpha_2^3 \,|\, n, s, t) \, E(q_t | M, T, u, v) = 2 \, \frac{(n_2 + st_2)(n_2 + st_2 + 1)(n_2 + st_2 + 2)(n_2 + st_2 + 3)}{(N + s)(N + s + 1)(N + s + 2)(N + s + 3)} \, \frac{M + uv}{u + T}$
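To make the $k = 2$ formulas concrete, the following Python sketch evaluates the second order approximations and their error bounds, and compares them with a reference value obtained by simple numerical integration. It is an illustration under stated assumptions: the function names and all input values are hypothetical, and the reference value assumes that, for $k = 2$, the posterior of $\alpha_2$ is a Beta distribution with parameters $n_2 + st_2$ and $n_1 + st_1$.

```python
def second_order_rates(n1, n2, s, t1, t2, M, T, u, v):
    """Second order approximations of E(q_1), E(q_2) for k = 2, with error bounds.

    Returns (q1, err1, q2, err2), using the nested forms and the error terms.
    """
    a1, a2, A = n1 + s * t1, n2 + s * t2, n1 + n2 + s
    e_qt = (M + u * v) / (u + T)  # posterior expectation of q_t
    q1 = a1 / A * (1 - a2 / (A + 1) * (1 - (a2 + 1) / (A + 2))) * e_qt
    q2 = 2 * a2 / A * (1 - (a2 + 1) / (A + 1) * (1 - (a2 + 2) / (A + 2))) * e_qt
    denom = A * (A + 1) * (A + 2) * (A + 3)
    err1 = a1 * a2 * (a2 + 1) * (a2 + 2) / denom * e_qt
    err2 = 2 * a2 * (a2 + 1) * (a2 + 2) * (a2 + 3) / denom * e_qt
    return q1, err1, q2, err2


def reference_rates(n1, n2, s, t1, t2, M, T, u, v, steps=20000):
    """Midpoint-rule values of E(q_1), E(q_2), with alpha_2 ~ Beta(a2, a1)."""
    a1, a2 = n1 + s * t1, n2 + s * t2
    e_qt = (M + u * v) / (u + T)
    total = g1 = g2 = 0.0
    for i in range(steps):
        x = (i + 0.5) / steps  # value of alpha_2; alpha_1 = 1 - x
        w = x ** (a2 - 1) * (1 - x) ** (a1 - 1)  # unnormalised Beta density
        total += w
        g1 += w * (1 - x) / (1 + x)
        g2 += w * 2 * x / (1 + x)
    return g1 / total * e_qt, g2 / total * e_qt


# Hypothetical inputs, chosen only for illustration.
args = dict(n1=20, n2=4, s=2.0, t1=0.9, t2=0.1, M=30, T=50, u=5.0, v=0.5)
q1, err1, q2, err2 = second_order_rates(**args)
ref1, ref2 = reference_rates(**args)
print(f"q1: approx {q1:.4f}, reference {ref1:.4f}, error bound {err1:.4f}")
print(f"q2: approx {q2:.4f}, reference {ref2:.4f}, error bound {err2:.4f}")
```

Since the order is even, the approximation should overestimate the reference value by no more than the stated error bound, which is what the comparison is meant to show.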

6.2. Sensitivity Analysis. As mentioned in earlier sections, due to epistemic uncertainty,
an analyst generally specifies bounds for the hyperparameters $s$, $t$, $u$, and $v$. The parameter sets
are, as before, denoted by $\mathcal{H}$ and $\mathcal{J}$. As we assumed $q_t$ and $\alpha$ to be independent (see Section 6.1),
we can separate the analysis into two simpler problems. We first calculate lower and upper
bounds on the expectation of the terms depending on $\alpha$, based on the results from Sections 4
and 6.1. Independently, we calculate lower and upper bounds on the expectation of $q_t$ as we did
in Section 5. These bounds uniquely determine $\underline{E}(q_j | n, M, T, \mathcal{H}, \mathcal{J})$ and
$\overline{E}(q_j | n, M, T, \mathcal{H}, \mathcal{J})$, as follows.

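For convenience of notation, define

(48) $g_j(\alpha) := \frac{1}{\binom{k-1}{j-1}} \frac{j\alpha_j}{\sum_{\ell=1}^{k} \ell\alpha_\ell}$.

Clearly, by Eq. (4),

(49) $q_j = g_j(\alpha) \, q_t$

so, as both factors are non-negative,

(50) $\underline{E}(q_j | n, M, T, \mathcal{H}, \mathcal{J}) = \underline{E}(g_j(\alpha) | n, \mathcal{H}) \, \underline{E}(q_t | M, T, \mathcal{J})$

where

(51) $\underline{E}(g_j(\alpha) | n, \mathcal{H}) = \min_{(s,t) \in \mathcal{H}} E(g_j(\alpha) | n, s, t)$

(52) $\underline{E}(q_t | M, T, \mathcal{J}) = \min_{(u,v) \in \mathcal{J}} E(q_t | M, T, u, v)$

Similar expressions for the upper expectation hold as well by simply replacing min by max at all
instances. When using Taylor approximation, an upper bound on the error term follows readily
as well (see example in Section 7).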

As seen in Section 5, the optimisation problem in Eq. (52) (and its counterpart for the upper
bound) can be solved exactly, using Eqs. (25) and (26). In contrast, the optimisation problem
in Eq. (51) (and its counterpart for the upper bound) is not so obvious, and we have to rely on
standard numerical algorithms for non-linear optimisation. However, the particular form of $\mathcal{H}$
we assumed in Section 4.2.2 (see Eq. (18)) makes this optimisation problem fairly easily solvable
by computer.

7. Example 

To conclude the paper, we demonstrate our methodology on a simple electrical network
reliability problem. Numbers are fictional, yet are representative of a typical network in the
North-East of England.

A group of customers is supplied with power from two identical distribution lines. Supply is
lost when both lines fail. Nationwide statistics show the typical failure rate of similar distribution
lines to be within ±50% of 0.35 per year. Nationwide statistics also show the typical fraction of
double failures to be between 10% and 20%. On the actual system under study, over the last 12
years, 11 failures were observed, 3 of which were double failures.

A typical quantity one would be interested in is $q_2 = g_2(\alpha) q_t$, the rate of double failures, as
this is also the rate at which customers lose power.

For the lower and upper expectation of $q_t$, we take $u = 3$ (in years); this means that we need
about 3 years of data before we start moving away from our prior. For $v$, we take $[0.175, 0.525]$,
that is, all values within ±50% of the nationwide average 0.35. We find^1

(53) $\underline{E}(q_t | M, T, \mathcal{J}) = \frac{M + u\underline{v}}{u + T} = 0.538$

(54) $\overline{E}(q_t | M, T, \mathcal{J}) = \frac{M + u\overline{v}}{u + T} = 0.577$
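The arithmetic behind these two bounds can be checked in a few lines of Python. This is a sketch under the assumption that the posterior expectation of $q_t$ has the form $(M + uv)/(u + T)$ used in Section 6.1, so that the bounds are attained at the endpoints of the interval for $v$; the function name is ours, and $M = 14$, $T = 24$ are the marginal counts for the two lines over 12 years.

```python
def posterior_mean_rate(M, T, u, v):
    """Posterior expectation of the marginal failure rate q_t."""
    return (M + u * v) / (u + T)


M = 8 + 3 * 2   # marginal line failures: 8 single plus 3 double failures
T = 2 * 12      # two lines, each observed for 12 years
u = 3.0         # learning parameter, in years
v_lower, v_upper = 0.5 * 0.35, 1.5 * 0.35  # within 50% of 0.35 per year

# The posterior mean is increasing in v, so the endpoints give the bounds.
lower = posterior_mean_rate(M, T, u, v_lower)
upper = posterior_mean_rate(M, T, u, v_upper)
print(f"lower = {lower:.3f}, upper = {upper:.3f}")
```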

For the lower and upper expectation of $g_2(\alpha)$, we take $[1, 4]$ for $s$,^2 $[0.8, 0.9]$ for $t_1$, and
$[0.1, 0.2]$ for $t_2$.

Our choice of $s = 4$ means that after observing four single failures (and no double failures),
we are prepared to reduce the prior upper fraction of double failures (0.2) by half. Our choice of
$s = 1$ means that after observing one double failure (and no single failures), we are prepared to
reduce the prior lower fraction of single failures (0.8) by half.



^1 We have two lines, each observed 12 years, with 8 single failures on either line, and 3 failures occurring in
both lines; whence, marginally, we observed $8 + 3 \times 2 = 14$ failures of a distribution line over a total timespan of
24 years.

^2 In this simple example, we have no zero counts, so we can do with a lower upper bound for $s$.
Then, using the 4th order Taylor approximation of $g_2(\alpha)$ as explained in Section 6.1,

(55) $\underline{E}(g_2(\alpha) | n, \mathcal{H}) = \min_{(s,t) \in \mathcal{H}} E(g_2(\alpha) | n, s, t)$

(56) $\approx \min_{(s,t) \in \mathcal{H}} 2 \, \frac{n_2 + st_2}{N + s} \left( 1 - \frac{n_2 + st_2 + 1}{N + s + 1} \left( 1 - \frac{n_2 + st_2 + 2}{N + s + 2} \left( 1 - \frac{n_2 + st_2 + 3}{N + s + 3} \left( 1 - \frac{n_2 + st_2 + 4}{N + s + 4} \right) \right) \right) \right)$

(57) $= 0.360$

where $N = 11$, and $n_2 = 3$. A similar expression holds for $\overline{E}(g_2(\alpha) | n, \mathcal{H})$; simply replace min
by max:

(58) $\overline{E}(g_2(\alpha) | n, \mathcal{H}) = 0.410$

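Both expressions for lower and upper expectation are accurate up to the following absolute error:

(59) $\max_{(s,t) \in \mathcal{H}} 2 \, \frac{(n_2 + st_2)(n_2 + st_2 + 1) \cdots (n_2 + st_2 + 5)}{(N + s)(N + s + 1) \cdots (N + s + 5)} = 0.006$

Concluding,

(60) $0.190 = (0.360 - 0.006) \times 0.538 \le \underline{E}(q_2 | n, M, T, \mathcal{H}, \mathcal{J}) \le \overline{E}(q_2 | n, M, T, \mathcal{H}, \mathcal{J}) \le (0.410 + 0.006) \times 0.577 = 0.240$

or in other words, double failures occur at an expected rate that lies between 0.19 and 0.24 per
year.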

A similar analysis for $q_1$ yields:

(61) $0.318 = (0.595 - 0.003) \times 0.538 \le \underline{E}(q_1 | n, M, T, \mathcal{H}, \mathcal{J}) \le \overline{E}(q_1 | n, M, T, \mathcal{H}, \mathcal{J}) \le (0.643 + 0.003) \times 0.577 = 0.373$

or in other words, single failures occur at an expected rate that lies between 0.318 and 0.373 per
year.
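The numbers of this example can be recomputed with a short script. The sketch below is our own illustration, not the optimisation routine used for the paper: it replaces a non-linear optimiser by a plain grid search over $s \in [1, 4]$ and $t_2 \in [0.1, 0.2]$ (assuming $t_1 = 1 - t_2$), and evaluates the 4th order nested approximations of $E(g_1(\alpha))$ and $E(g_2(\alpha))$ together with their error bounds. All function and variable names are hypothetical.

```python
def approx_and_bound(lead, a2, A, start, order=4):
    """Nested Taylor approximation of E[lead term / (1 + alpha_2)] and error bound.

    The ratios r_i = (a2 + start + i - 1) / (A + i) come from Dirichlet moments;
    the approximation is lead * (1 - r_1 (1 - r_2 (... (1 - r_order)))).
    """
    ratios = [(a2 + start + i - 1) / (A + i) for i in range(1, order + 2)]
    inner = 1.0
    for r in reversed(ratios[:order]):
        inner = 1.0 - r * inner
    bound = lead
    for r in ratios:  # first neglected term of the expansion
        bound *= r
    return lead * inner, bound


n1, n2 = 8, 3            # observed single and double failures
M, T, u = 14, 24, 3.0    # marginal failures, line-years, learning parameter
v_lo, v_hi = 0.175, 0.525
steps = 30

g1, g2, e1, e2 = [], [], [], []
for i in range(steps + 1):
    s = 1.0 + 3.0 * i / steps              # s in [1, 4]
    for j in range(steps + 1):
        t2 = 0.1 + 0.1 * j / steps         # t_2 in [0.1, 0.2], t_1 = 1 - t_2
        a1, a2, A = n1 + s * (1 - t2), n2 + s * t2, n1 + n2 + s
        value1, bound1 = approx_and_bound(a1 / A, a2, A, start=0)
        value2, bound2 = approx_and_bound(2 * a2 / A, a2, A, start=1)
        g1.append(value1)
        e1.append(bound1)
        g2.append(value2)
        e2.append(bound2)

qt_lo = (M + u * v_lo) / (u + T)
qt_hi = (M + u * v_hi) / (u + T)
q1_interval = ((min(g1) - max(e1)) * qt_lo, (max(g1) + max(e1)) * qt_hi)
q2_interval = ((min(g2) - max(e2)) * qt_lo, (max(g2) + max(e2)) * qt_hi)

print(f"g2: [{min(g2):.3f}, {max(g2):.3f}], error {max(e2):.3f}")
print(f"g1: [{min(g1):.3f}, {max(g1):.3f}], error {max(e1):.3f}")
print(f"q2 in [{q2_interval[0]:.3f}, {q2_interval[1]:.3f}]")
print(f"q1 in [{q1_interval[0]:.3f}, {q1_interval[1]:.3f}]")
```

The final intervals here are computed from unrounded intermediate values, so they may differ in the third decimal from the figures in the text, which multiply values already rounded to three decimals. A grid search is only adequate because the parameter set is a small box; for more general sets a proper non-linear optimiser would be needed.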

In this simple example with two redundant components, posterior imprecision for the single
failure rate is similar to the posterior imprecision for the double failure rate. This is essentially
a special feature of the two component case, because it must hold that $\alpha_1 + \alpha_2 = 1$ when $k = 2$.
In case of larger $k$, the differences in posterior imprecision between common-cause failure rates
will be considerably larger, as in the numerical examples of Section 4.2 where, for instance, in
one of the cases considered, $\overline{E}(\alpha_j | n, \mathcal{H}) - \underline{E}(\alpha_j | n, \mathcal{H})$ ranges from 0.001 to 0.011.

8. Conclusion 

We studied elicitation of hyperparameters for inferences that arise in the alpha-factor
representation of the basic parameter model. For the hyperparameters of the Dirichlet prior for the
alpha-factors, we argued that bounds, rather than precise values, are desirable, due to
inferences being strongly sensitive to the choice of prior distribution, particularly when faced with
zero counts. We concluded that assigning an interval for the learning parameter is especially
important. In doing so, we effectively adapted the imprecise Dirichlet model [13] to represent
epistemic uncertainty in the alpha-factors.
For the marginal failure rate, the second part of the model, we proposed a set of Gamma
priors with similar properties as the set of Dirichlet priors used for the alpha-factors. As zero
counts are generally not an issue for this part of the model, it may suffice to consider a fixed
learning parameter here.

We identified simple ways to elicit information about the hyperparameters, by reasoning on 
hypothetical data, rather than by maximum entropy arguments as done in an earlier study [5J [5] 
on the estimation of alpha-factors. Essentially, the analyst needs to specify how quickly he is 
willing to learn from various sorts of hypothetical data. 

Taking everything together, we arrived at a powerful procedure for analysing the influence 
of epistemic uncertainty on all common-cause failure rates, the central quantities of interest in 
the basic parameter model. As there is no immediate closed-form solution for the expectation 
of these failure rates, we presented an approximation based on Taylor expansion, and quantified 
the error of the approximation at any order. 

By allowing the analyst to specify bounds for all hyperparameters, along with clear
interpretations of these bounds, we effectively provided an operational method for full sensitivity
analysis of common-cause failure rates, properly reflecting epistemic uncertainty of the analyst
on all levels of the model. The procedure was illustrated by means of a simple electrical network
example, demonstrating its feasibility and usefulness.

In the paper, we chose the sets of hyperparameters to be of a very specific convex form
(Eqs. (18) and (29)). This led to simple calculations (at least for this problem), and made
elicitation fairly straightforward. Nevertheless, other shapes could still provide a better fit to any
given epistemic information, and perhaps also have better updating properties. Such shapes may,
however, be more difficult to elicit. More general shapes for sets of Beta priors are discussed in
[15]. Already for Beta priors, elicitation of these shapes is non-trivial, and provides an interesting
challenge. We leave a thorough study of such issues, for Dirichlet and Gamma priors, to future
work.

Another aspect we neglected in this paper is the calculation of (imprecise) credible intervals. 
We expect that some clever approximation procedure may be needed. 